Noise reduction by coupling of stochastic processes and canalization in biology 
Randomness is an unavoidable feature of the intracellular
environment due to chemical reactants being present
in low copy number. That phenomenon, predicted by
Delbrück long ago [1], has been detected in both prokaryotic
and eukaryotic cells after the development
of fluorescence techniques. On the other hand, developing
organisms, e.g. D. melanogaster, exhibit strikingly
precise spatio-temporal patterns of protein/mRNA
concentrations. Those two characteristics of living
organisms are in apparent contradiction: the precise
patterns of protein concentrations are the result of multi- 
ple mutually interacting random chemical reactions. The 
main question is to establish biochemical mechanisms for 
coupling random reactions so that canalization, or fluc- 
tuations reduction instead of amplification, takes place. 
Here we explore a model for coupling two stochastic pro- 
cesses where the noise of the combined process can be 
smaller than that of the isolated ones. Such a canaliza- 
tion occurs if, and only if, there is negative covariance be- 
tween the random variables of the model. Our results are 
obtained in the framework of a master equation for a
negatively self-regulated (or externally regulated) binary
gene and show that the precise control achieved by negative
self-regulation arises because it may generate negative
covariance. Our results suggest that negative covariance, in
the coupling of random chemical reactions, is a theoreti- 
cal mechanism underlying the precision of developmental 
processes. 

We approach the stochastic model for a binary gene
operating under negative self-regulation or external
regulation. Analytic solutions for the steady-state
and the dynamic regimes, as well as the
symmetries underlying solubility, have already
been presented. This model treats transcription and
translation as a single combined process. The state of
the system is defined by two stochastic variables: the gene
state (active or repressed) and the protein number in
the cytoplasm. The gene state is defined effectively in
terms of its promoter site as on/off under the action of
an external agent, e.g. a protein encoded by another gene,
or by self-interaction.

We consider a stochastic formulation for the dynamics
of the probability of finding the gene in the active (or
repressed) state, indicated by $\alpha_n$ (or $\beta_n$), when $n$ proteins
are found in the cytoplasm. The protein synthesis rates
are given by $k$ or $\chi k$ ($0 < \chi < 1$) when the gene is,
respectively, turned "on" or "off". The degradation rate
of the proteins is given by $\rho$. The "off-on" switching rate
is denoted by $f$, while $h_1$ and $h_2$ indicate the opposite
transition rates. The master equation is:



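$$
\frac{d\alpha_n}{dt} = k(\alpha_{n-1} - \alpha_n) + \rho[(n+1)\alpha_{n+1} - n\alpha_n] - (h_1 n + h_2)\alpha_n + f\beta_n, \tag{1}
$$

$$
\frac{d\beta_n}{dt} = \chi k(\beta_{n-1} - \beta_n) + \rho[(n+1)\beta_{n+1} - n\beta_n] + (h_1 n + h_2)\alpha_n - f\beta_n. \tag{2}
$$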



The existence of an "on-off" transition rate dependent on $n$
indicates a negatively self-regulating gene ($h_2 = 0$). For an
externally regulated gene, one assumes $h_1$ to be zero. The
proportionality to $n$ is effective and has no relationship
with the actual biochemical mechanisms of protein
binding/unbinding to DNA regulatory regions, which might
involve a plethora of chemical reactions.
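As a rough numerical cross-check of Eqs. (1) and (2), the sketch below approximates the stationary $\alpha_n$ and $\beta_n$ by truncating the protein number at a cutoff `nmax` and iterating a uniformized (discrete-time) version of the jump process until it stops changing. The function name, the cutoff, the tolerance and the rate values are assumptions made for illustration; this is not the analytic method the text relies on.

```python
def stationary_binary_gene(k, chi, rho, f, h1, h2, nmax=50, tol=1e-13, max_iter=100_000):
    """Approximate stationary (alpha_n, beta_n) of Eqs. (1)-(2) for n = 0..nmax."""
    size = nmax + 1
    # Uniformization constant: strictly larger than the total exit rate of any state.
    lam = k + (rho + h1) * nmax + h2 + f + 1.0
    alpha = [0.5 / size] * size
    beta = [0.5 / size] * size
    for _ in range(max_iter):
        new_a = [0.0] * size
        new_b = [0.0] * size
        for n in range(size):
            up_on = k if n < nmax else 0.0          # synthesis with the gene on
            up_off = chi * k if n < nmax else 0.0   # synthesis with the gene off
            down = rho * n                          # degradation
            to_off = h1 * n + h2                    # on -> off switching
            a, b = alpha[n], beta[n]
            if n < nmax:
                new_a[n + 1] += a * up_on / lam
                new_b[n + 1] += b * up_off / lam
            if n > 0:
                new_a[n - 1] += a * down / lam
                new_b[n - 1] += b * down / lam
            new_b[n] += a * to_off / lam            # gene switches off
            new_a[n] += b * f / lam                 # gene switches on
            # Probability of no jump during one uniformized step.
            new_a[n] += a * (1.0 - (up_on + down + to_off) / lam)
            new_b[n] += b * (1.0 - (up_off + down + f) / lam)
        change = max(abs(x - y) for x, y in zip(new_a + new_b, alpha + beta))
        alpha, beta = new_a, new_b
        if change < tol:
            break
    return alpha, beta


# Externally regulated example (h1 = 0); all rate values are illustrative.
alpha, beta = stationary_binary_gene(k=10.0, chi=0.2, rho=1.0, f=2.0, h1=0.0, h2=3.0)
p1 = sum(alpha)
mean_n = sum(n * (a + b) for n, (a, b) in enumerate(zip(alpha, beta)))
print(p1, mean_n)
```

For this externally regulated example the gene switches independently of $n$, so the result should be close to $f/(f+h_2) = 0.4$ for the probability of the active state and to $(k/\rho)\,0.4 + (\chi k/\rho)\,0.6 = 5.2$ for the mean protein number.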

Eqs. (1) and (2) might be considered as the coupling of
two different Poissonian processes, each of them related
to one of the gene states. The processes are coupled
through a second stochastic variable, the gene state. To
each gene state we associate the quantities $N_1 = k/\rho$
and $N_2 = \chi k/\rho$, which are the stationary averages of the
isolated Poisson processes. Therefore, the biologically
measurable quantities are the protein number in the
cytoplasm, $n$, and the protein synthesis rates $\{N_1, N_2\}$, and
we proceed to evaluate the noise on $n$.

For that purpose, we start by defining the moments of the
random variables $(n, N)$ in terms of $\alpha_n$ and $\beta_n$ as:



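$$
\langle n^i N^j \rangle = \sum_{n=0}^{\infty} n^i \left( N_1^j \alpha_n + N_2^j \beta_n \right). \tag{3}
$$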



The marginal probabilities of finding $n$ proteins inside the
cell are given by $\phi_n = \alpha_n + \beta_n$, and the protein synthesis
rate has a probability $p_1 = \sum_{n=0}^{\infty} \alpha_n$ (or $p_2 = \sum_{n=0}^{\infty} \beta_n$)
to be $N_1$ (or $N_2$).

As we have two random variables, it is convenient to
use the covariance between $n$ and $N$, indicated by $\xi_{n,N}$,
for the analysis of the variance of the number of gene
products, namely

$$
\xi_{n,N} = \langle nN \rangle - \langle n \rangle \langle N \rangle. \tag{4}
$$
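The moments and the covariance can be evaluated directly from tabulated values of $\alpha_n$ and $\beta_n$. The sketch below assumes the joint moments are $\langle n^r N^s \rangle = \sum_n n^r (N_1^s \alpha_n + N_2^s \beta_n)$, which is the natural reading of the definitions above; the function names and the toy probabilities are hypothetical.

```python
def joint_moment(alpha, beta, N1, N2, r, s):
    """<n^r N^s>, assuming N = N1 with weight alpha_n and N = N2 with weight beta_n."""
    return sum(n**r * (N1**s * a + N2**s * b)
               for n, (a, b) in enumerate(zip(alpha, beta)))


def covariance(alpha, beta, N1, N2):
    """xi = <nN> - <n><N>, as in Eq. (4)."""
    return (joint_moment(alpha, beta, N1, N2, 1, 1)
            - joint_moment(alpha, beta, N1, N2, 1, 0)
            * joint_moment(alpha, beta, N1, N2, 0, 1))


# Toy joint distribution on n in {0, 1}; the numbers are made up for illustration.
alpha = [0.1, 0.2]   # gene on
beta = [0.3, 0.4]    # gene off
phi = [a + b for a, b in zip(alpha, beta)]   # marginal distribution of n
p1, p2 = sum(alpha), sum(beta)               # marginal probabilities of the gene state
xi = covariance(alpha, beta, N1=4.0, N2=1.0)
print(phi, p1, p2, xi)
```

With these toy numbers $\langle n \rangle = 0.6$, $\langle N \rangle = 1.9$ and $\langle nN \rangle = 1.2$, so the covariance is $0.06$.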



The noise on the protein number of the composed system
of Eqs. (1) and (2) is computed in terms of
the Fano factor, defined as the ratio between the
variance and the mean of $n$,

$$
\mathcal{F} = \frac{\langle n^2 \rangle - \langle n \rangle^2}{\langle n \rangle}. \tag{5}
$$



As a Poisson (or Fano) distribution has a Fano factor
equal to unity, the Fano factor is used to determine
how different from the Poissonian a probability
distribution is. When $\mathcal{F} > 1$ the distribution is broader and
named super-Poisson (or super-Fano), while it is named
sub-Poisson (or sub-Fano) when $\mathcal{F} < 1$. As shown
in the supplementary material, in the stationary limit
the Fano factor reduces to

$$
\mathcal{F} = 1 + \frac{\xi_{n,N}}{\langle n \rangle}, \tag{6}
$$

whose value depends on the sign of the covariance between
the two stochastic variables of the model. It is
worth mentioning that this relation for the Fano factor
holds for a gene operating in two, three or more states
of synthesis.
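One way to see where Eq. (6) comes from is to take moments of the master equation directly. The lines below are a sketch of that calculation and not necessarily the route followed in the supplementary material; they use $k = \rho N_1$, $\chi k = \rho N_2$, $\langle N \rangle = N_1 p_1 + N_2 p_2$ and $\langle nN \rangle = \sum_n n (N_1 \alpha_n + N_2 \beta_n)$. Adding Eqs. (1) and (2) cancels the switching terms, and multiplying the sum by $n$ or $n^2$ and summing over $n$ gives:

```latex
\begin{align*}
\frac{d\phi_n}{dt} &= k(\alpha_{n-1}-\alpha_n) + \chi k(\beta_{n-1}-\beta_n)
                      + \rho\left[(n+1)\phi_{n+1} - n\phi_n\right], \\
\frac{d\langle n\rangle}{dt} &= \rho\left(\langle N\rangle - \langle n\rangle\right), \\
\frac{d\langle n^2\rangle}{dt} &= \rho\left(2\langle nN\rangle + \langle N\rangle
                      - 2\langle n^2\rangle + \langle n\rangle\right), \\
\text{stationarity:}\quad \langle n\rangle &= \langle N\rangle, \qquad
                      \langle n^2\rangle = \langle nN\rangle + \langle n\rangle, \\
\sigma^2 &= \langle n^2\rangle - \langle n\rangle^2
          = \langle n\rangle + \langle nN\rangle - \langle n\rangle\langle N\rangle
          = \langle n\rangle + \xi_{n,N}, \\
\mathcal{F} &= \frac{\sigma^2}{\langle n\rangle} = 1 + \frac{\xi_{n,N}}{\langle n\rangle}.
\end{align*}
```

The switching rates never enter these moment equations, which is consistent with the relation holding for any number of synthesis states.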

Eq. (6) shows that sub-Poissonian, Poissonian or
super-Poissonian distributions may occur for the
probability distributions generated by Eqs. (1) and (2),
depending on the value of $\xi_{n,N}$. As we shall show
below, for an externally regulated gene, the covariance
$\xi_{n,N}$ satisfies

$$
\xi_{n,N} \geq 0. \tag{7}
$$



For the negatively self-regulating gene, the covariance
may be

$$
\xi_{n,N} \geq 0 \quad \text{or} \quad \xi_{n,N} < 0. \tag{8}
$$



Thus, while an externally regulated gene operates only
in the super-Poisson and Poisson regimes, the negatively
self-regulated gene operates in both of these regimes plus
the sub-Poissonian one.



A closed form for Eqs. (7) and (8), for the binary
gene, is written with the help of the exact solutions of
Eqs. (1) and (2). Before writing the closed
forms for the covariance of the externally regulated or
self-interacting gene, we redefine the model's parameters
as:

$$
z_0 = \frac{\rho}{\rho + h_1}, \quad N_1 = \frac{k}{\rho}, \quad a = \frac{f}{\rho}, \quad b = \frac{f + h_2}{\rho} z_0 + N_1 z_0 (1 - z_0). \tag{9}
$$



In particular, taking $z_0 = 1$ (or $h_1 = 0$) results in

$$
b = \frac{f + h_2}{\rho} \tag{10}
$$

for the externally regulated gene, and $0 < a < b$ follows when
one compares Eqs. (9) and (10). The negatively
self-regulating gene has $h_2 = 0$ and

$$
b = \left[ a + N_1 (1 - z_0) \right] z_0. \tag{11}
$$



In the set of parameters $\{a, b, z_0\}$ one takes
$N_1 = \frac{b - a z_0}{z_0 (1 - z_0)}$ and, due to the positivity of $N_1$, we see that, for
a fixed $b$, $a$ lies in the interval $[0, b/z_0]$. The choice of a
fixed $b$ is made because it is the invariant characterizing the Lie
symmetry of the binary model. Now we present
explicit forms for the covariances, whose demonstration
is given in the supplementary material.
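The change of parameters can be written as a pair of small helper functions. This sketch assumes the reading $z_0 = \rho/(\rho + h_1)$, $N_1 = k/\rho$, $a = f/\rho$ and $b = \frac{f + h_2}{\rho} z_0 + N_1 z_0 (1 - z_0)$ for Eq. (9), which reproduces Eqs. (10) and (11) in the two limits; the function names and numerical rates are hypothetical.

```python
def reduced_parameters(k, rho, f, h1, h2):
    """Map the rates of Eqs. (1)-(2) to (z0, N1, a, b), assuming the form of Eq. (9)."""
    z0 = rho / (rho + h1)
    N1 = k / rho
    a = f / rho
    b = (f + h2) / rho * z0 + N1 * z0 * (1.0 - z0)
    return z0, N1, a, b


def n1_from_abz0(a, b, z0):
    """Invert Eq. (11) for the self-regulating gene (h2 = 0, 0 < z0 < 1)."""
    return (b - a * z0) / (z0 * (1.0 - z0))


# Self-regulating gene (h2 = 0) with illustrative rates.
z0, N1, a, b = reduced_parameters(k=10.0, rho=1.0, f=2.0, h1=1.0, h2=0.0)
print(z0, N1, a, b)              # 0.5 10.0 2.0 3.5
print(n1_from_abz0(a, b, z0))    # 10.0, recovering N1

# Externally regulated gene (h1 = 0): z0 = 1 and b = (f + h2) / rho.
print(reduced_parameters(k=10.0, rho=1.0, f=2.0, h1=0.0, h2=3.0))
```

Positivity of the recovered $N_1$ is what restricts $a$ to $[0, b/z_0]$ at fixed $b$.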

The covariance of the externally regulated gene ($z_0 = 1$)
is given by

$$
\xi_{n,N} = N_1^2 \, \frac{a}{b} \, \frac{b - a}{b (1 + b)} > 0, \tag{12}
$$

since $a < b$.
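For the externally regulated gene the covariance can be checked with a few lines of arithmetic. The sketch below uses $\xi_{n,N} = (N_1 - N_2)^2\, a (b - a) / [b^2 (1 + b)]$, which we obtained from the stationary first-moment equations of Eqs. (1) and (2) with $h_1 = 0$; it is our own derivation, offered as an assumption, and it reduces to an $N_1^2$ prefactor when $N_2 = 0$. The function name and the parameter values are hypothetical.

```python
def external_gene_stats(N1, N2, a, b):
    """Mean, covariance and Fano factor for the externally regulated gene (h1 = 0).

    Assumes a = f / rho and b = (f + h2) / rho, so that p1 = a / b.
    """
    p1 = a / b
    mean = N1 * p1 + N2 * (1.0 - p1)
    xi = (N1 - N2) ** 2 * a * (b - a) / (b ** 2 * (1.0 + b))
    fano = 1.0 + xi / mean          # Eq. (6)
    return mean, xi, fano


mean, xi, fano = external_gene_stats(N1=10.0, N2=0.0, a=2.0, b=5.0)
print(mean, xi, fano)   # 4.0 4.0 2.0
```

Because $0 < a < b$, the covariance is never negative here, so the Fano factor stays at or above one.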

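For the negatively self-regulating gene ($h_2 = 0$) the
covariance has a more complicated form, given in terms of
the mean protein number $\langle n \rangle = C N_1 (a z_0 / b) M(a + 1, b + 1, N_1 z_0 (1 - z_0))$,
where $M$ denotes the Kummer confluent hypergeometric function and
$C$ a normalization constant. Namely, it is given as:

$$
\xi_{n,N} = \frac{a z_0}{1 - z_0} \left( N_1 - \langle n \rangle \right) - \langle n \rangle^2, \tag{13}
$$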



where we used Eq. (9). The reader should keep
in mind that the average protein number is a
function of the parameters $(a, b, z_0)$, and $\langle n \rangle$ is used for
brevity.
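The mean, covariance and Fano factor of the self-regulating gene can be evaluated numerically from $(a, b, z_0)$. The sketch below makes three assumptions that are ours: the normalization constant is taken as $C = 1/M(a, b, N_1 z_0 (1 - z_0))$, the covariance is taken as $\xi_{n,N} = a z_0 (N_1 - \langle n \rangle)/(1 - z_0) - \langle n \rangle^2$ (which follows from summing Eq. (2) over $n$ at stationarity with $h_2 = 0$ and $\chi = 0$), and the Kummer function is summed by its plain power series, which is adequate only for moderate non-negative arguments. The function names are hypothetical.

```python
def kummer_m(a, b, x, eps=1e-16, max_terms=10_000):
    """Power series of Kummer's M(a, b, x); intended for moderate |x| only."""
    term = total = 1.0
    for j in range(1, max_terms):
        term *= (a + j - 1) / (b + j - 1) * x / j
        total += term
        if abs(term) < eps * abs(total):
            break
    return total


def self_regulating_stats(a, b, z0):
    """Mean, covariance and Fano factor for h2 = 0, with 0 < z0 < 1 and 0 < a < b / z0."""
    N1 = (b - a * z0) / (z0 * (1.0 - z0))       # from Eq. (11)
    x = N1 * z0 * (1.0 - z0)
    mean = N1 * (a * z0 / b) * kummer_m(a + 1, b + 1, x) / kummer_m(a, b, x)
    xi = a * z0 * (N1 - mean) / (1.0 - z0) - mean ** 2
    return mean, xi, 1.0 + xi / mean            # Fano factor from Eq. (6)


print(self_regulating_stats(5.0, 5.0, 0.5))     # a = b: Poisson case, Fano factor 1
print(self_regulating_stats(2.0, 1.0, 0.25))    # a > b: Fano factor below 1
```

For $a = 2$, $b = 1$, $z_0 = 0.25$ the Kummer functions reduce to elementary expressions, giving $\langle n \rangle = 10/9$, $\xi_{n,N} = -16/81$ and a Fano factor of $37/45 \approx 0.82$, a sub-Poissonian case.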

Fig. 1A shows the Fano factor versus the average
protein number, for a fixed value of $a$, for the
negatively self-interacting gene. Each line corresponds to a
fixed value of $b$ and a varying $z_0$. As has been
demonstrated earlier, the sub-Fano regimes occur
only when $a > b$. Since we are exclusively interested
in the sub-Fano regime, we have investigated only the
condition $a > b$, which corresponds to the negative
covariance regime. Unexpectedly, the Fano factor has a minimum
at a mean protein number equal to one. For the higher values
of the average protein number, the Fano factor is smaller
than that of the Poisson process by a factor of 2.

A. Fano factor  B. Normalized covariance  C. Probability distribution

FIG. 1. A. Fano factor versus the average protein number. Here we fixed $a = 500$, with different colors standing for fixed
values of $b$ as indicated in the legend. The Fano factor for the binary gene has 0.5 as an asymptotic limit when the protein
numbers tend to their maximum value, as $z_0 \to 0$. This limit corresponds to the condition of very slow protein degradation.
The behaviour of $\mathcal{F}$ depends on the value of $b$ when $z_0 \to b/a$. It has a simple decay to 0.5 for $b > 0.1$, while it presents a
pronounced minimum at $\langle n \rangle = 1$ for $b < 0.1$. Surprisingly, the lowest value of the Fano factor occurs in the limit
of low molecule numbers. B. Normalized covariance versus the probability for the gene to be active. In that graph
we have plotted Eq. (14) versus $p_2$. We have fixed $z_0 = 0.45$ and, for each value of $b$, we varied $a$ from zero to $b/z_0$. The
correlation is positive for $a < b$ and negative for $a > b$. For $a = b$ we have $p_2 = 1 - z_0$ and the correlation is null. C. Probability
distribution of the protein number for the negatively self-regulating gene, $\phi_n = C \frac{(a)_n}{(b)_n} \frac{(N_1 z_0)^n}{n!} M(a + n, b + n, -N_1 z_0^2)$. The
coupling between the "on" and "off" states sharpens the probability distribution, as shown by the increase of the parameters $a$
and $z_0$. The constants $(a, b, z_0)$ for the lines in black, blue, green, cyan and red are, respectively, $(5 \times 10^3, 1, 10^{-4})$, $(50, 50, 0.5)$,
$(14, 70, 0.5)$, $(1, 15, 0.95)$, $(1, 2, 0.99)$.

Fig. 1B shows the normalized covariance, calculated as
the ratio between the covariance and the square root of the
product of the variance of $n$ and the variance of $N$, namely

$$
\frac{\xi_{n,N}}{\sqrt{\sigma_n^2 \, \sigma_N^2}}, \tag{14}
$$

versus the probability for the gene to be active, $p_1$. The
covariance is positive for $p_1 < z_0$ (or $a < b$), when the
gene has a high probability of staying repressed. The
covariance is zero for $p_1 = z_0$ (or $a = b$), and $\phi_n$ is then a
Poissonian distribution. When $p_1 > z_0$
(or $a > b$), the covariance is negative and one obtains a
sub-Fano regime, with sharply localized probability distributions.

Fig. 1C shows the effect of increasing the intensity
of the coupling of the two gene states on the probability
distribution $\phi_n$ of finding $n$ proteins inside the cell for the
negatively self-regulating gene. The broader distributions,
shown in cyan and red, correspond
to $a$ and $z_0$ close to 0 and 1, respectively,
that is, the limit of low values of the switching rate
(or coupling) constants $f$ and $h_1$ in Eqs. (1) and
(2). For intermediate values of the coupling constants,
the distribution is still super-Poisson but gets narrower, as
shown by the green line. A Poisson distribution
is represented by the blue line. In the limit $a > b$ and
$z_0 \approx 0$, for strong coupling, the probability distribution becomes highly
localized, as indicated by the black line.
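The shape of $\phi_n$ can be explored numerically from the closed form quoted in the caption of Fig. 1. The sketch below reads that expression as $\phi_n = C \frac{(a)_n}{(b)_n} \frac{(N_1 z_0)^n}{n!} M(a + n, b + n, -N_1 z_0^2)$ with $C = 1/M(a, b, N_1 z_0 (1 - z_0))$; both the argument $-N_1 z_0^2$ and the value of $C$ are our assumptions, chosen so that the distribution sums to one. The Kummer function is summed by its power series, which is reliable only for moderate arguments, and the function names are hypothetical.

```python
import math


def kummer_m(a, b, x, eps=1e-16, max_terms=10_000):
    """Power series of Kummer's M(a, b, x); intended for moderate |x| only."""
    term = total = 1.0
    for j in range(1, max_terms):
        term *= (a + j - 1) / (b + j - 1) * x / j
        total += term
        if abs(term) < eps * abs(total):
            break
    return total


def protein_distribution(a, b, z0, nmax):
    """phi_n for n = 0..nmax, self-regulating gene (h2 = 0, 0 < z0 < 1)."""
    N1 = (b - a * z0) / (z0 * (1.0 - z0))
    C = 1.0 / kummer_m(a, b, N1 * z0 * (1.0 - z0))
    phi = []
    prefactor = 1.0   # (a)_n / (b)_n * (N1 z0)^n / n!, updated recursively
    for n in range(nmax + 1):
        phi.append(C * prefactor * kummer_m(a + n, b + n, -N1 * z0 ** 2))
        prefactor *= (a + n) / (b + n) * N1 * z0 / (n + 1)
    return phi


phi = protein_distribution(2.0, 1.0, 0.25, 40)      # a > b: sub-Poissonian case
mean = sum(n * p for n, p in enumerate(phi))
var = sum(n * n * p for n, p in enumerate(phi)) - mean ** 2
print(sum(phi), mean, var / mean)

poisson = protein_distribution(5.0, 5.0, 0.5, 60)   # a = b: Poisson with mean 5
print(poisson[0], math.exp(-5.0))
```

For $a = 2$, $b = 1$, $z_0 = 0.25$ the distribution should sum to one with mean $10/9$ and Fano factor $37/45$, while for $a = b$ it reduces to a Poisson distribution with mean $N_1 z_0$.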

In the case of a sub-Poissonian process of protein
synthesis, the Fano factor is smaller than that of the
uncoupled Poissonian processes. Hence, negative
covariance induces canalization when two stochastic
processes are coupled. We suggest this as the theoretical
mechanism underlying the higher precision of the
negatively self-regulating gene: the possibility of regimes
where negative covariance between protein number and
gene synthesis rates (or, equivalently, gene states) exists.

Finally, the variance, $\sigma^2$, of the number of products of
a gene operating in multiple modes of synthesis follows
directly from the expression for the Fano factor as

$$
\sigma^2 = \langle n \rangle + \xi_{n,N}. \tag{15}
$$

It is clear from this equation that the variance for the
negatively self-regulating binary gene is smaller than that of
a Poissonian distribution, for a fixed average protein
number, when the covariance between the protein number
and the gene state is negative. In other words, the
probability distribution of $n$ is then sharply localized around
$\langle n \rangle$.

The existence of a negative covariance in a negatively
self-regulating gene is intuitively predicted from the
analysis of Eqs. (1) and (2). The coupling between these
two equations is a function of $n$. This is interpreted
as follows: the higher the number of proteins in the
cytoplasm of the cell, the higher the probability for the
gene to switch to the repressed state and, consequently,
to have a lower value of the protein synthesis rate, i.e.,
an increase in $n$ might decrease $N$. Although it is not easy to
demonstrate, it is tempting to conjecture the existence of
negative covariance regimes in negatively self-regulating
genes operating in more than two modes of expression.
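The intuition can be made slightly more explicit by rewriting the covariance in terms of conditional means. This is a sketch in our own notation, where $\langle n \rangle_{\rm on} = \sum_n n \alpha_n / p_1$ and $\langle n \rangle_{\rm off} = \sum_n n \beta_n / p_2$ denote the mean protein number given that the gene is active or repressed (symbols that do not appear in the original text):

```latex
\begin{align*}
\xi_{n,N} &= \sum_n n\,(N_1\alpha_n + N_2\beta_n)
             - \langle n\rangle\,(N_1 p_1 + N_2 p_2) \\
          &= (N_1 - N_2)\left(\sum_n n\,\alpha_n - p_1\langle n\rangle\right) \\
          &= (N_1 - N_2)\,p_1 p_2
             \left(\langle n\rangle_{\rm on} - \langle n\rangle_{\rm off}\right).
\end{align*}
```

Since $N_1 > N_2$, the covariance is negative exactly when the gene is, on average, found active at lower protein numbers than those at which it is found repressed, which is what an on-off switching rate growing with $n$ tends to produce.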

Biologically, the cell processes requiring higher precision
would have a biochemical machinery that implements
the negative covariance. Furthermore, one would expect
the gene to switch multiple times during a time interval
without a significant change of the protein number.
Under these two assumptions, the variance of the
protein number is expected to be small.

In summary, in this manuscript we have shown that 
the higher precision on the number of gene products by 
the stochastic gene under negative self-regulation is due 
to the negative covariance between two random variables: 
protein number and protein synthesis rate. Our results 
suggest this as a general mechanism underlying the vari- 
ance reduction (or canalization) in the cell environment. 
Further research should shed light on the biochemical
implementation of negative covariance in networks or cascades
of biochemical reactions. Experimental verification of our
results would employ detection of both gene activation 
and protein numbers and analysis of their covariance. 
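As an illustration of the kind of analysis such an experiment would require, the sketch below estimates the covariance between protein number and synthesis rate, together with the Fano factor, from paired single-cell observations. The data layout (one protein count and one on/off gene reading per cell), the function name and the numbers are all hypothetical, and the synthesis rates $N_1$ and $N_2$ are assumed to be known.

```python
def covariance_and_fano(protein_counts, gene_on, N1, N2=0.0):
    """Population estimates of xi = <nN> - <n><N> and of the Fano factor of n."""
    m = len(protein_counts)
    rates = [N1 if on else N2 for on in gene_on]   # synthesis rate seen in each cell
    mean_n = sum(protein_counts) / m
    mean_rate = sum(rates) / m
    mean_product = sum(n * r for n, r in zip(protein_counts, rates)) / m
    var_n = sum((n - mean_n) ** 2 for n in protein_counts) / m
    return mean_product - mean_n * mean_rate, var_n / mean_n


# Made-up observations for four cells.
counts = [2, 4, 4, 6]
states = [True, True, False, False]
xi, fano = covariance_and_fano(counts, states, N1=10.0)
print(xi, fano)   # -5.0 0.5
```

In this toy sample the active gene is seen at the lower protein counts, so the estimated covariance is negative. The relation $\sigma^2 = \langle n \rangle + \xi_{n,N}$ is a stationary property of the model and is not expected to hold for arbitrary data such as these.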